perm filename NETDOC.PLN[ESS,JMC]1 blob sn#019557 filedate 1973-01-13 generic text, type C, neo UTF8
PLAN FOR KEEPING DOCUMENTS IN THE NETWORK


	At the ARPA IPT principal investigators' meeting, 8-10 January,
John McCarthy was charged with preparing proposals concerning keeping
documents in the network.  As I see it, the issues include the following,
most of which were raised by various people at the IPT meeting and at
the meeting a year ago.

	1. It now costs no more to store a book on either of the terabit
files than to keep it on a shelf.  Therefore, the storage technology is
now adequate to move towards having all the world's literature in a
computer-accessible form.  The marginal cost of storing one more book
on the IBM 3330 at the Stanford AI Lab is about $5.00 per month.  The
display terminal situation is not quite so favorable, but many of the
labs are acquiring systems usable for reading.

	2. It should be possible to identify Defense Department needs for
on-line document storage and reading.  It was said at the meeting that
some part of the DoD had RFPed a system for keeping the technical manuals
for obsolete equipment on-line.  It is very important to do all this
in a uniform way so that the eventual goal of making all literature
accessible from everywhere can be achieved.  Otherwise, a situation might arise where a
part of the Armed Services will need more than one system of display
consoles because they subscribe to two incompatible information systems.

	3. A prototype of such a standard system can be an ARPA network
system provided we pay due attention to generality.  Already, various
troubles occur when files are transferred between sites on the net.  Here
are some of them:
	a. ASCII made a minor change some years ago, and Stanford has the
old characters and M.I.T. the new.
	b. Different filler characters are used.
	c. The low order bit is treated differently.
	d. Stanford has an extended character set.
	e. NLS text files require a specific hierarchical structure not
used by anyone else.
	These are only the incompatibilities that have accidentally come
to my attention.  Those who use IBM machines have different problems, and
because different balls are used on Selectrics, the IBM users have a
tendency to use the same character codes for different characters in
different situations.
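
	Troubles a, b and c above can be illustrated by a short sketch
of what a receiving site might do with an incoming text file.  It is only
a sketch under stated assumptions, not a description of any site's actual
software.  It assumes the PDP-10 convention of five 7-bit characters packed
left-justified in a 36-bit word with the low order bit left over, assumes
the filler character is NUL, and assumes the ASCII change in question is
the one that replaced the up-arrow and left-arrow of the 1963 code by the
circumflex and underline of the 1967 code.  The names FILLER,
OLD_ASCII_GLYPHS, unpack_word, pack_word and decode are hypothetical.

```python
# Sketch only.  Assumptions (not taken from the memo):
#   - five 7-bit characters per 36-bit word, low order bit spare
#   - NUL (code 0) is the filler character
#   - "old" ASCII shows up-arrow and left-arrow at codes 0x5E and 0x5F

FILLER = 0

# Code positions whose glyphs differ under the assumed old ASCII.
OLD_ASCII_GLYPHS = {0x5E: "\u2191", 0x5F: "\u2190"}


def unpack_word(word):
    """Return the five 7-bit codes in a 36-bit word, ignoring the low order bit."""
    word >>= 1  # discard the spare low order bit, however the sender set it
    return [(word >> shift) & 0x7F for shift in (28, 21, 14, 7, 0)]


def pack_word(codes, low_bit=0):
    """Pack five 7-bit codes into a 36-bit word (helper for building test data)."""
    word = 0
    for code in codes:
        word = (word << 7) | (code & 0x7F)
    return (word << 1) | (low_bit & 1)


def decode(words, old_ascii=False):
    """Turn a sequence of 36-bit words into text, dropping filler characters."""
    out = []
    for word in words:
        for code in unpack_word(word):
            if code == FILLER:
                continue  # padding, not part of the text
            if old_ascii and code in OLD_ASCII_GLYPHS:
                out.append(OLD_ASCII_GLYPHS[code])
            else:
                out.append(chr(code))
    return "".join(out)
```

	The point of the sketch is that the same words yield different text
depending on which site's conventions the reader assumes, so a uniform
system has to carry that information with the file or fix it by agreement.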

	4. One problem is that of extending the character set beyond the
96-character ASCII set, which is the largest commonly used.  I propose that
we go to an arbitrary character set rather than make a fixed extension.  An
old NIC memo advocating a way of doing this is hereby withdrawn, because I
now know a better way.  More about this later.

	5. Various people including McCarthy and Frank Hart have proposed that
the IPT acquire a network facility for optical text reading that will be
adequate for reading in books, articles, and other documents from the existing
literature.  Naturally, Information International Inc. in L.A. and possibly
others would dearly like to be paid to provide such a facility.  I think
it should be done provided the cost is not too great.  Perhaps the best
justification is that experience is necessary before DoD can make general
use of the technology.  Otherwise, incompatible specialized systems are
likely to be generated that will be very expensive to co-ordinate.

	It has been pointed out by many that before the country undertakes
a giant project to read in the world's literature, we should make more use
of the fact that much present literature is produced in computer-readable
form.  Pursuing that idea, we looked into the possibility of getting the
computer science literature produced in computer-readable form.  Neither
the ACM nor the IEEE now does this, and the ACM is resistant to the suggestion
at present for reasons that I consider bad.

	At my initiative, the ACM Special Interest Group on Artificial
Intelligence is producing its Newsletter on the ARPA net using NLS, which
is convenient because the editors are at SRI.

	Before definite proposals can be produced, more information is needed
on at least the following questions:

	a. What is the state of the network text file transfer protocol?

	b. What is the state of the network display protocol?

	c. What DoD needs can be identified for network-type access to
large collections of documents?